Alternation makes the adversary weaker in two-player games

Neural Information Processing Systems


Motivated by alternating game-play in two-player games, we study an alternating variant of \textit{Online Linear Optimization} (OLO). In alternating OLO, a \textit{learner} at each round $t \in [T]$ selects a vector $x_t$ and then an \textit{adversary} selects a cost vector $c_t \in [-1,1]^n$. The learner then experiences cost $(c_t + c_{t-1})^\top x_t$ instead of $c_t^\top x_t$ as in standard OLO. We establish that under this small twist, the $\Omega(\sqrt{T})$ lower bound on the regret is no longer valid. More precisely, we present two online learning algorithms for alternating OLO that respectively admit $\mathcal{O}((\log n)^{4/3} T^{1/3})$ regret for the $n$-dimensional simplex and $\mathcal{O}(\rho \log T)$ regret for the ball of radius $\rho > 0$.
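To make the protocol concrete, the sketch below simulates alternating OLO on the $\rho$-ball. The learner here is plain projected online gradient descent against random costs, a generic baseline chosen for illustration only, not one of the paper's algorithms; the step size, adversary, and regret accounting are all assumptions of this sketch.

```python
import numpy as np

def alternating_olo_regret(T, n, rho, rng):
    """Simulate alternating OLO on the ball of radius rho.

    Protocol (as in the abstract): at round t the learner picks x_t,
    the adversary picks c_t in [-1, 1]^n, and the learner pays the
    alternating cost (c_t + c_{t-1})^T x_t, with c_0 := 0.

    Learner: projected online gradient descent (a generic baseline,
    NOT the paper's algorithm). Adversary: uniform random costs.
    """
    x = np.zeros(n)
    c_prev = np.zeros(n)            # c_0 := 0
    eta = rho / np.sqrt(T)          # standard OGD step size (a choice, not tuned)
    total_cost = 0.0
    cum_cost = np.zeros(n)          # running sum of effective cost vectors
    for _ in range(T):
        c = rng.uniform(-1.0, 1.0, size=n)   # adversary's cost vector
        effective = c + c_prev               # alternating cost vector
        total_cost += effective @ x
        cum_cost += effective
        # OGD step, then project back onto the rho-ball
        x = x - eta * effective
        norm = np.linalg.norm(x)
        if norm > rho:
            x *= rho / norm
        c_prev = c
    # Best fixed point in hindsight on the rho-ball minimizes cum_cost^T x,
    # i.e. x* = -rho * cum_cost / ||cum_cost||, with cost -rho * ||cum_cost||.
    best_fixed_cost = -rho * np.linalg.norm(cum_cost)
    return total_cost - best_fixed_cost      # regret vs. best fixed action

rng = np.random.default_rng(0)
reg = alternating_olo_regret(T=2000, n=5, rho=1.0, rng=rng)
```

Against this random adversary the baseline already keeps the regret modest; the point of the paper's results is that, on the simplex and the ball, alternation allows guarantees strictly below the standard $\Omega(\sqrt{T})$ barrier.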